Goto

Collaborating Authors

 sum-product algorithm



Accumulator Networks: Suitors of Local Probability Propagation

Neural Information Processing Systems

One way to approximate inference in richly-connected graphical models is to apply the sum-product algorithm (a.k.a. The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many condi(cid:173) tional probability functions - including the sigmoid function - di(cid:173) rect application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or more generally a tree. After giving expressions for inference and learning in accumulator net(cid:173) works, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image.


Temporal Parallelization of Inference in Hidden Markov Models

Hassan, Sakira, Särkkä, Simo, García-Fernández, Ángel F.

arXiv.org Machine Learning

This paper presents algorithms for parallelization of inference in hidden Markov models (HMMs). In particular, we propose parallel backward-forward type of filtering and smoothing algorithm as well as parallel Viterbi-type maximum-a-posteriori (MAP) algorithm. We define associative elements and operators to pose these inference problems as parallel-prefix-sum computations in sum-product and max-product algorithms and parallelize them using parallel-scan algorithms. The advantage of the proposed algorithms is that they are computationally efficient in HMM inference problems with long time horizons. We empirically compare the performance of the proposed methods to classical methods on a highly parallel graphical processing unit (GPU).


Data-Driven Factor Graphs for Deep Symbol Detection

Shlezinger, Nir, Farsad, Nariman, Eldar, Yonina C., Goldsmith, Andrea J.

arXiv.org Machine Learning

Many important schemes in signal processing and communications, ranging from the BCJR algorithm to the Kalman filter, are instances of factor graph methods. This family of algorithms is based on recursive message passing-based computations carried out over graphical models, representing a factorization of the underlying statistics. Consequently, in order to implement these algorithms, one must have accurate knowledge of the statistical model of the considered signals. In this work we propose to implement factor graph methods in a data-driven manner. In particular, we propose to use machine learning (ML) tools to learn the factor graph, instead of the overall system task, which in turn is used for inference by message passing over the learned graph. We apply the proposed approach to learn the factor graph representing a finite-memory channel, demonstrating the resulting ability to implement BCJR detection in a data-driven fashion. We demonstrate that the proposed system, referred to as BCJRNet, learns to implement the BCJR algorithm from a small training set, and that the resulting receiver exhibits improved robustness to inaccurate training compared to the conventional channel-model-based receiver operating under the same level of uncertainty. Our results indicate that by utilizing ML tools to learn factor graphs from labeled data, one can implement a broad range of model-based algorithms, which traditionally require full knowledge of the underlying statistics, in a data-driven fashion.


Model-Based Detector for SSDs in the Presence of Inter-cell Interference

Yassine, Hachem, Badiu, Mihai-Alin, Coon, Justin

arXiv.org Machine Learning

In this paper, we consider the problem of reducing the bit error rate of flash-based solid state drives (SSDs) when cells are subject to inter-cell interference (ICI). By observing that the outputs of adjacent victim cells can be correlated due to common aggressors, we propose a novel channel model to accurately represent the true flash channel. This model, equivalent to a finite-state Markov channel model, allows the use of the sum-product algorithm to calculate more accurate posterior distributions of individual cell inputs given the joint outputs of victim cells. These posteriors can be easily mapped to the log-likelihood ratios that are passed as inputs to the soft LDPC decoder. When the output is available with high precision, our simulation showed that a significant reduction in the bit-error rate can be obtained, reaching $99.99\%$ reduction compared to current methods, when the diagonal coupling is very strong. In the realistic case of low-precision output, our scheme provides less impressive improvements due to information loss in the process of quantization. To improve the performance of the new detector in the quantized case, we propose a new iterative scheme that alternates multiple times between the detector and the decoder. Our simulations showed that the iterative scheme can significantly improve the bit error rate even in the quantized case.


Belief Propagation for Continuous State Spaces: Stochastic Message-Passing with Quantitative Guarantees

Noorshams, Nima, Wainwright, Martin J.

arXiv.org Machine Learning

The sum-product or belief propagation (BP) algorithm is a widely used message-passing technique for computing approximate marginals in graphical models. We introduce a new technique, called stochastic orthogonal series message-passing (SOSMP), for computing the BP fixed point in models with continuous random variables. It is based on a deterministic approximation of the messages via orthogonal series expansion, and a stochastic approximation via Monte Carlo estimates of the integral updates of the basis coefficients. We prove that the SOSMP iterates converge to a \delta-neighborhood of the unique BP fixed point for any tree-structured graph, and for any graphs with cycles in which the BP updates satisfy a contractivity condition. In addition, we demonstrate how to choose the number of basis coefficients as a function of the desired approximation accuracy \delta and smoothness of the compatibility functions. We illustrate our theory with both simulated examples and in application to optical flow estimation.


Counting in Graph Covers: A Combinatorial Characterization of the Bethe Entropy Function

Vontobel, Pascal O.

arXiv.org Artificial Intelligence

We present a combinatorial characterization of the Bethe entropy function of a factor graph, such a characterization being in contrast to the original, analytical, definition of this function. We achieve this combinatorial characterization by counting valid configurations in finite graph covers of the factor graph. Analogously, we give a combinatorial characterization of the Bethe partition function, whose original definition was also of an analytical nature. As we point out, our approach has similarities to the replica method, but also stark differences. The above findings are a natural backdrop for introducing a decoder for graph-based codes that we will call symbolwise graph-cover decoding, a decoder that extends our earlier work on blockwise graph-cover decoding. Both graph-cover decoders are theoretical tools that help towards a better understanding of message-passing iterative decoding, namely blockwise graph-cover decoding links max-product (min-sum) algorithm decoding with linear programming decoding, and symbolwise graph-cover decoding links sum-product algorithm decoding with Bethe free energy function minimization at temperature one. In contrast to the Gibbs entropy function, which is a concave function, the Bethe entropy function is in general not concave everywhere. In particular, we show that every code picked from an ensemble of regular low-density parity-check codes with minimum Hamming distance growing (with high probability) linearly with the block length has a Bethe entropy function that is convex in certain regions of its domain.


The Phylogenetic Indian Buffet Process: A Non-Exchangeable Nonparametric Prior for Latent Features

Miller, Kurt T., Griffiths, Thomas, Jordan, Michael I.

arXiv.org Machine Learning

Nonparametric Bayesian models are often based on the assumption that the objects being modeled are exchangeable. While appropriate in some applications (e.g., bag-of-words models for documents), exchangeability is sometimes assumed simply for computational reasons; non-exchangeable models might be a better choice for applications based on subject matter. Drawing on ideas from graphical models and phylogenetics, we describe a non-exchangeable prior for a class of nonparametric latent feature models that is nearly as efficient computationally as its exchangeable counterpart. Our model is applicable to the general setting in which the dependencies between objects can be expressed using a tree, where edge lengths indicate the strength of relationships. We demonstrate an application to modeling probabilistic choice.


Dual Decomposition for Marginal Inference

Domke, Justin (Rochester Institute of Technology)

AAAI Conferences

We present a dual decomposition approach to the tree-reweighted belief propagation objective. Each tree in the tree-reweighted bound yields one subproblem, which can be solved with the sum-product algorithm. The master problem is a simple differentiable optimization, to which a standard optimization method can be applied. Experimental results on 10x10 Ising models show the dual decomposition approach using L-BFGS is similar in settings where message-passing converges quickly, and one to two orders of magnitude faster in settings where message-passing requires many iterations, specifically high accuracy convergence, and strong interactions.


Efficient Bayesian Inference for Dynamically Changing Graphs

Sumer, Ozgur, Acar, Umut, Ihler, Alexander T., Mettu, Ramgopal R.

Neural Information Processing Systems

Motivated by stochastic systems in which observed evidence and conditional dependencies betweenstates of the network change over time, and certain quantities of interest (marginal distributions, likelihood estimates etc.) must be updated, we study the problem of adaptive inference in tree-structured Bayesian networks. We describe an algorithm for adaptive inference that handles a broad range of changes to the network and is able to maintain marginal distributions, MAP estimates, and data likelihoods in all expected logarithmic time. We give an implementation of our algorithm and provide experiments that show that the algorithm can yield up to two orders of magnitude speedups on answering queries and responding to dynamic changesover the sum-product algorithm.